OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

An image-based automatic Arabic translation system

Internal identifier: 000A93 (Main/Exploration); previous: 000A92; next: 000A94

Authors: YI CHANG [United States]; DATONG CHEN [United States]; YING ZHANG [United States]; JIE YANG [United States]

Source: Pattern recognition (ISSN 0031-3203), 2009

RBID : Pascal:09-0262446

Abstract

In this paper, we present a system that automatically translates Arabic text embedded in images into English. The system consists of three components: text detection from images, character recognition, and machine translation. We formulate the text detection as a binary classification problem and apply gradient boosting tree (GBT), support vector machine (SVM), and location-based prior knowledge to improve the F1 score of text detection from 78.95% to 87.05%. The detected text images are processed by off-the-shelf optical character recognition (OCR) software. We employ an error correction model to post-process the noisy OCR output, and apply a bigram language model to reduce word segmentation errors. The translation module is tailored with compact data structure for hand-held devices. The experimental results show substantial improvements in both word recognition accuracy and translation quality. For instance, in the experiment of Arabic transparent font, the BLEU score increases from 18.70 to 33.47 with use of the error correction module.
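
As a rough illustration of the bigram language model step mentioned in the abstract, the Python sketch below splits an unspaced string into words by maximising an add-one smoothed bigram score with a Viterbi-style search. The toy English corpus, the function names (`train_bigrams`, `log_prob`, `segment`), the smoothing choice, and the `max_len` parameter are all assumptions made for this example; the record does not describe the authors' actual model or data.

```python
import math
from collections import Counter

# Toy training corpus (hypothetical; the paper's data are not given in this record).
CORPUS = [
    "the exit is on the left",
    "no exit on the right",
    "the left exit is closed",
]


def train_bigrams(sentences):
    """Count unigrams and bigrams, with <s> marking the sentence start."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def log_prob(prev, word, unigrams, bigrams, vocab_size):
    """Add-one smoothed log P(word | prev) (smoothing is an assumption)."""
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))


def segment(text, unigrams, bigrams, max_len=10):
    """Viterbi-style search for the best split of `text` into known words.

    Returns a list of words, or None if no split into known words exists.
    """
    vocab = {w for w in unigrams if w != "<s>"}
    vocab_size = len(vocab)
    # best[(end, last_word)] = (log score, backpointer to previous state)
    best = {(0, "<s>"): (0.0, None)}
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word not in vocab:
                continue
            # Extend every partial segmentation that ends exactly at `start`.
            for (pos, prev), (score, _) in list(best.items()):
                if pos != start:
                    continue
                cand = score + log_prob(prev, word, unigrams, bigrams, vocab_size)
                key = (end, word)
                if key not in best or cand > best[key][0]:
                    best[key] = (cand, (pos, prev))
    finals = [(v[0], k) for k, v in best.items() if k[0] == len(text)]
    if not finals:
        return None
    _, key = max(finals)
    # Follow backpointers from the best final state to the start.
    words = []
    while key != (0, "<s>"):
        words.append(key[1])
        key = best[key][1]
    return words[::-1]


if __name__ == "__main__":
    unigrams, bigrams = train_bigrams(CORPUS)
    print(segment("theexitisontheleft", unigrams, bigrams))
```

On this toy corpus, `segment("theexitisontheleft", ...)` recovers the six words of the first training sentence, and a string containing no known word yields `None`. A real system would need a far larger corpus and a policy for out-of-vocabulary words, both of which are outside what this record specifies.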


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">An image-based automatic Arabic translation system</title>
<author>
<name sortKey="Yi Chang" sort="Yi Chang" uniqKey="Yi Chang" last="Yi Chang">YI CHANG</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author>
<name sortKey="Datong Chen" sort="Datong Chen" uniqKey="Datong Chen" last="Datong Chen">DATONG CHEN</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author>
<name sortKey="Ying Zhang" sort="Ying Zhang" uniqKey="Ying Zhang" last="Ying Zhang">YING ZHANG</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author>
<name sortKey="Jie Yang" sort="Jie Yang" uniqKey="Jie Yang" last="Jie Yang">JIE YANG</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">09-0262446</idno>
<date when="2009">2009</date>
<idno type="stanalyst">PASCAL 09-0262446 INIST</idno>
<idno type="RBID">Pascal:09-0262446</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000226</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000554</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000210</idno>
<idno type="wicri:doubleKey">0031-3203:2009:Yi Chang:an:image:based</idno>
<idno type="wicri:Area/Main/Merge">000B04</idno>
<idno type="wicri:Area/Main/Curation">000A93</idno>
<idno type="wicri:Area/Main/Exploration">000A93</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">An image-based automatic Arabic translation system</title>
<author>
<name sortKey="Yi Chang" sort="Yi Chang" uniqKey="Yi Chang" last="Yi Chang">YI CHANG</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author>
<name sortKey="Datong Chen" sort="Datong Chen" uniqKey="Datong Chen" last="Datong Chen">DATONG CHEN</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author>
<name sortKey="Ying Zhang" sort="Ying Zhang" uniqKey="Ying Zhang" last="Ying Zhang">YING ZHANG</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author>
<name sortKey="Jie Yang" sort="Jie Yang" uniqKey="Jie Yang" last="Jie Yang">JIE YANG</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue</s1>
<s2>Pittsburgh, PA 15213</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Pattern recognition</title>
<title level="j" type="abbreviated">Pattern recogn.</title>
<idno type="ISSN">0031-3203</idno>
<imprint>
<date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Pattern recognition</title>
<title level="j" type="abbreviated">Pattern recogn.</title>
<idno type="ISSN">0031-3203</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Accuracy</term>
<term>Arabic</term>
<term>Automatic classification</term>
<term>Automatic system</term>
<term>Automatic translation</term>
<term>Character recognition</term>
<term>Data structure</term>
<term>English</term>
<term>Error correction</term>
<term>Image classification</term>
<term>Image processing</term>
<term>Image recognition</term>
<term>Language processing</term>
<term>Learning algorithm</term>
<term>Localization</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Portable equipment</term>
<term>Segmentation</term>
<term>Signal classification</term>
<term>Speech processing</term>
<term>Speech recognition</term>
<term>Support vector machine</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Système automatique</term>
<term>Traduction automatique</term>
<term>Arabe</term>
<term>Anglais</term>
<term>Reconnaissance image</term>
<term>Reconnaissance caractère</term>
<term>Algorithme apprentissage</term>
<term>Machine vecteur support</term>
<term>Localisation</term>
<term>Reconnaissance optique caractère</term>
<term>Correction erreur</term>
<term>Segmentation</term>
<term>Structure donnée</term>
<term>Appareil portatif</term>
<term>Reconnaissance parole</term>
<term>Précision</term>
<term>Classification image</term>
<term>Traitement langage</term>
<term>Reconnaissance forme</term>
<term>Classification signal</term>
<term>Classification automatique</term>
<term>Traitement parole</term>
<term>Traitement image</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Traduction automatique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">In this paper, we present a system that automatically translates Arabic text embedded in images into English. The system consists of three components: text detection from images, character recognition, and machine translation. We formulate the text detection as a binary classification problem and apply gradient boosting tree (GBT), support vector machine (SVM), and location-based prior knowledge to improve the F1 score of text detection from 78.95% to 87.05%. The detected text images are processed by off-the-shelf optical character recognition (OCR) software. We employ an error correction model to post-process the noisy OCR output, and apply a bigram language model to reduce word segmentation errors. The translation module is tailored with compact data structure for hand-held devices. The experimental results show substantial improvements in both word recognition accuracy and translation quality. For instance, in the experiment of Arabic transparent font, the BLEU score increases from 18.70 to 33.47 with use of the error correction module.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Pennsylvanie</li>
</region>
<settlement>
<li>Pittsburgh</li>
</settlement>
<orgName>
<li>Université Carnegie-Mellon</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="Pennsylvanie">
<name sortKey="Yi Chang" sort="Yi Chang" uniqKey="Yi Chang" last="Yi Chang">YI CHANG</name>
</region>
<name sortKey="Datong Chen" sort="Datong Chen" uniqKey="Datong Chen" last="Datong Chen">DATONG CHEN</name>
<name sortKey="Jie Yang" sort="Jie Yang" uniqKey="Jie Yang" last="Jie Yang">JIE YANG</name>
<name sortKey="Ying Zhang" sort="Ying Zhang" uniqKey="Ying Zhang" last="Ying Zhang">YING ZHANG</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A93 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000A93 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:09-0262446
   |texte=   An image-based automatic Arabic translation system
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024